Field of the Invention

[0001] This invention relates to a method and apparatus for providing
simultaneous multi-path inputs in a system with multiple simultaneous input and
output modes, such as voice and display/keyboard, or in a system where many users may
interact with the same session, such as a remote learning session where multiple
students attend a given class. This invention resolves the issue of two or more input
values from multiple different devices or users attempting to occupy the same
location.

Background of the Invention 

[0002] Various services now provide voice and non-voice access to Internet 
data. A caller may access a "Voice Portal" or "Voice Site" by simply dialing a 
number advertised by the company providing the Voice Access service. The caller 
will hear a greeting that requests the caller to "speak" or "enter" specific commands. 
As an example, a caller may ask the system to provide him/her with the latest weather
information by simply speaking a command, or pressing a DTMF button on the 
phone. The information provided to the user may be pre-recorded and accessed from a 
database, or it may be accessed from a page similar to those available on the Internet. 
The mark-up language used to code the page may be VoiceXML or any other type of 
XML-based coding language. Some legacy systems may use proprietary or less 
commonly used methods for connecting the system to back-end data servers. 
[0003] However, in all existing systems users interact with data only through
one interface: either a voice interface (e.g. a telephone) or a data interface (e.g. an
Internet browser). This single-mode interaction limits the delivery of
services to users. As an example a user who is driving a car may ask for address 
information between point A and point B by issuing voice commands, and hear back 
the directions read to him via a speaker in the car. However the same navigation 
information would not be available in graphical format. Another example is a user 
who is using a data enabled mobile phone to review his investment portfolio. The user 
may wish to see the data, but input the queries by simply speaking them into the 
phone. Current systems do not allow for such capability. 



[0004] Another limitation of existing systems is that they do not allow more 
than one user to interact with an application in one session. The current invention 
makes this possible. One example of where this may be required is a cooperative form 
filling application where two users need to be logged onto the same session, and each 
answers specific questions as they are presented. The current invention makes it
possible for the two attendants to call into the system, and interact with the same 
application through a single session, thereby filling one form by two users. 
[0005] The problem that arises in multi-modal or multi-user interaction with a 
single session (as in the above examples) is that multiple input values may be received 
for the same query through different channels. A simple solution would be to accept 
the first chronologically arriving input value, and discard the subsequent ones. This 
solution, however, fails when there are many rounds of query-input in the same 
application. Consider the case of a query A followed by two inputs a-1 and a-2. Input
a-1 is accepted, but before input a-2 arrives in the system, another query B is made.
Now input a-2 arrives in the system, followed by a valid input b-1. The system would
accept false input a-2, and discard valid input b-1. Fig. 1 illustrates when "Accept
First Input" fails in the case of multiple queries and inputs.

[0006] The solution to this problem is to identify every input with the name of 
the query that it is attempting to address. In this case the system would know that the 
second input, a-2, is not intended for query B, would discard it, and would accept the
valid input b-1.

[0007] However, this solution also falls short when the same dialog is used
repeatedly. For example, suppose the system makes a query A for the first time (designated
as A1), and two responses a1-1 and a1-2 are sent back. Response a1-1 is accepted as valid, but
before response a1-2 arrives, the system repeats the same dialog, repeating query A
(designated as A2). The user(s) reply with a response a2-1. However, false response a1-2
arrives first and is accepted as valid, and valid input a2-1 is discarded as invalid. Fig. 2
illustrates when "Accept Tagged Input" fails when the same dialog is repeated.

Summary of the Invention 

[0008] The present invention resolves the above-described problem by adding a
query turn indicator (invocation counter) to each input name, and then accepting only 
those response values whose tag and turn indicator match the expected input. For 
example, when query A is made for the first time, the system registers an open slot for 
input values matching query A-1. User inputs are all tagged in the same fashion, so
all inputs in response to query A-1 would be tagged as A-1-n (n being the path
identifier). Fig. 3 illustrates how "Accept Tagged Input with Turn Indicator" accepts 
only the proper input. As seen in Fig. 3, this method allows the system to properly 
identify the response values, and discard the false ones. 
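
By way of illustration only, the three acceptance policies discussed above can be compared by replaying the event sequences of Figs. 1 and 2. The following is a minimal sketch, not the claimed implementation; the event tuples and the names `replay`, `accept_first`, `accept_tagged` and `accept_with_turn` are assumptions introduced for this example.

```python
# Illustrative sketch only: all names and the event format are hypothetical.
# An "ask" event opens a slot for a query; an "input" event carries the query
# name and turn (invocation count) that the input was actually made for.

def replay(events, key):
    """Replay events and return the input values the policy accepts.

    `key(query, turn)` returns the part of the tag that the policy compares.
    """
    open_slot = None          # what the system currently expects, if anything
    accepted = []
    for event in events:
        if event[0] == "ask":
            _, query, turn = event
            open_slot = key(query, turn)
        else:
            _, query, turn, value = event
            if open_slot is not None and key(query, turn) == open_slot:
                accepted.append(value)
                open_slot = None   # the slot is filled; later inputs are discarded
    return accepted


def accept_first(query, turn):
    return "any"              # every input matches whatever slot is open


def accept_tagged(query, turn):
    return query              # compare the query name only


def accept_with_turn(query, turn):
    return (query, turn)      # compare the query name and the turn indicator


# Fig. 1: late input a-2 (made for query A) arrives after query B is asked.
FIG_1 = [("ask", "A", 1), ("input", "A", 1, "a-1"),
         ("ask", "B", 1), ("input", "A", 1, "a-2"), ("input", "B", 1, "b-1")]

# Fig. 2: late input a1-2 (made for turn 1 of A) arrives after A is repeated.
FIG_2 = [("ask", "A", 1), ("input", "A", 1, "a1-1"),
         ("ask", "A", 2), ("input", "A", 1, "a1-2"), ("input", "A", 2, "a2-1")]

fig1_first = replay(FIG_1, accept_first)        # wrongly accepts a-2
fig1_tagged = replay(FIG_1, accept_tagged)      # accepts b-1
fig2_tagged = replay(FIG_2, accept_tagged)      # wrongly accepts a1-2
fig2_turn = replay(FIG_2, accept_with_turn)     # accepts a2-1
```

In this sketch only the policy that compares both the query name and the turn indicator accepts the intended input in both sequences.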

[0009] The present invention is a gateway for simultaneous multi-path, multi- 
modal, and multi-user access to data and applications via non-voice and voice 
devices. It takes into account the issue of processing inputs collected from multiple 
users or multiple devices when they are all interacting with the same single 
application in the same single session. It also resolves the issue of simultaneous inputs 
made in response to consecutive queries in different states. 

[0010] The method of simultaneous multi-path inputs enables inputs to be made 
via any voice interface or data interface devices (e.g. phone, keyboard, PDA, etc.)
used during the same session. The inputs are fed to the session object, and each input
is given a unique identifier, or unique "ticket". An invocation counter in turn
tracks when the inputs are made. The inputs and their associated identifiers are stored
in the memory of the session object.

[0011] Each time an input is received in response to a query, the gateway 
checks to see whether the input ticket matches that of the most recent query. If so, the 
input is accepted; otherwise, the input is discarded.

Brief Description of Drawings 

[0012] The various aspects, advantages and novel features of the present 
invention will be more readily comprehended from the following detailed 
description when read in conjunction with the appended drawings, in which: 
[0013] Fig. 1 illustrates when "Accept First Input" fails in the case of two 
consecutive queries and two nearly simultaneous responses to the first query, and how 
the second input (made in response to the first query) may be accepted by mistake as a 
valid response to the second query; 

[0014] Fig. 2 illustrates when "Accept Tagged Input" fails when the same 
dialog is repeated, and that, if the input data is tagged to the dialog alone, a problem 
occurs if the same dialog is repeated twice, and the second input (made in response to 
the first query) is accepted by mistake as a valid response to the second query; 
[0015] Fig. 3 illustrates when "Accept Tagged Input with Turn Indicator" 
accepts only the proper input; and 

[0016] Fig. 4 illustrates how Inputs A and B are made simultaneously, and how 
Input B attempts to occupy the same location as Input A but is discarded in 
accordance with the present invention. 
Detailed Description of Preferred Embodiments

[0017] With reference to Fig. 4, a software method is provided in accordance with
the present invention to allow one or more users to interact with data and applications
using multiple modes of interaction (voice, data, etc.) simultaneously. The solution
comprises four main components, that is,

a Session Management Gateway (7) capable of interacting with an
application (9) from the one side (using standard Internet protocols
for connection to Internet based applications) and multiple client
interfaces such as a Telephone Interface (3) or a Data Device
Interface (5) from the other side, and also capable of maintaining the
transaction session with the application (9) separate from interaction
sessions with client devices, and capable of maintaining the
interaction session with the application (9) in a database (8) even if
no client device is connected at that moment to the session pertaining
to the said transaction,

a Data Device Interface capable of interacting with data devices
equipped with display, keyboard, sound interface, location sensor,
etc. A data device may have any combination of one or more human or
machine data sources which can relay user input (e.g. a keyboard) or
produce data automatically (e.g. a location sensor) as well as modules
which can present data (e.g. a display that shows the data to a human,
or a relay that uses the data to control an engine),

a Telephony Interface that allows callers to access their sessions
using any type of voice interface devices (e.g. a mobile phone (1)),
and is capable of presenting the data to the user in audible fashion,
and also capable of collecting input from the user in spoken fashion
(spoken commands) as well as other forms such as DTMF input, and

a Database (8) which maintains transaction sessions controlled by
the Session Management Gateway (7).

[0018] With reference to Fig. 4, a software system provides voice and non-voice access
to data and applications for simultaneous multi-path user access, and uses a ticketing
mechanism to track the order of the inputs. Voice devices can include telephone-based
devices as well as microphone access to an Internet system. Non-voice devices
can include a keyboard or a Personal Digital Assistant (PDA). As shown in Fig. 4, Inputs
A and B are made simultaneously. Input B tries to occupy the same location as Input
A, but is discarded.

[0019] A simultaneous input session involves concurrent inputs from multiple 
devices. In Fig. 4, the application in the backend (9) issues Query 1 to the Session 
Management Gateway (7). The query is formatted for each client, and is assigned a
unique identifier and an invocation counter identifying how many times the same
exact query in this dialog has been visited in this session. The resulting query is then
sent to all client interfaces currently logged on to the transaction session pertaining to
this interaction. In response to Query 1, the user makes an utterance into the mobile phone
(1), which is taken as Input A. Each client interface attaches the proper identifier
and invocation counter to the input phrases arriving through that interface, and
sends the result to the Session Management Gateway (7).

[0020] Nearly simultaneously with input A, another input is made (Input B) in 
response to the same query from a data device (4). Input A arrives at the Session 
Management Gateway (7). There, the unique identifier and the invocation counter
assigned to the query are compared to the unique identifier and invocation counter
assigned to the input. If both match, the input is accepted, and is sent to the 
application. The said unique identifier will then be marked as invalid. The next input 
arriving from a client (Input B) carrying the now invalid unique identifier would be 
discarded. 
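
The comparison and invalidation steps described in paragraphs [0019] and [0020] might be sketched as follows. This is a minimal illustration under stated assumptions rather than the actual gateway: the class names `Ticket` and `SessionGateway`, the methods `issue_query` and `submit`, and the callback signatures are hypothetical, and the database (8) and the network transport are omitted.

```python
# Illustrative sketch only: class, method and callback names are hypothetical.
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Ticket:
    query_id: str     # unique identifier assigned to the query
    invocation: int   # how many times this query has been issued in the session


class SessionGateway:
    def __init__(self, forward: Callable[[str, str], None]) -> None:
        self._forward = forward                  # delivers accepted input to the application
        self._invocations: Dict[str, int] = {}   # per-query invocation counters
        self._open: Optional[Ticket] = None      # ticket still valid for one input
        self.clients: List[Callable[[Ticket, str], None]] = []  # client interfaces

    def issue_query(self, query_id: str, prompt: str) -> Ticket:
        """Assign identifier and invocation counter, then send to every client."""
        count = self._invocations.get(query_id, 0) + 1
        self._invocations[query_id] = count
        self._open = Ticket(query_id, count)
        for send in self.clients:
            send(self._open, prompt)
        return self._open

    def submit(self, ticket: Ticket, value: str) -> bool:
        """Accept the input only if its ticket matches the open query."""
        if self._open is None or ticket != self._open:
            return False                         # stale or already-used ticket: discard
        self._open = None                        # mark the identifier as invalid
        self._forward(ticket.query_id, value)
        return True


# Fig. 4 scenario: Inputs A and B both answer Query 1; only the first is accepted.
received = []
gateway = SessionGateway(lambda query_id, value: received.append((query_id, value)))
ticket = gateway.issue_query("Query 1", "prompt text")
accepted_a = gateway.submit(ticket, "Input A")   # e.g. from the mobile phone (1)
accepted_b = gateway.submit(ticket, "Input B")   # e.g. from the data device (4)
```

Here `accepted_a` is true and `accepted_b` is false, because the ticket is invalidated as soon as the first matching input is forwarded to the application.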

[0021] It will be apparent to those skilled in the art that the same string could contain
the unique identifier and the invocation counter. 
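
One possible way of carrying both values in a single string is sketched below. The `#` separator and the function names `encode_ticket` and `decode_ticket` are assumptions chosen for this example; the specification does not prescribe any particular format.

```python
# Illustrative sketch only: the "#" separator and function names are hypothetical.
from typing import Tuple


def encode_ticket(query_id: str, invocation: int) -> str:
    """Combine the unique identifier and the invocation counter in one string."""
    return f"{query_id}#{invocation}"


def decode_ticket(ticket: str) -> Tuple[str, int]:
    """Split on the last separator, so the identifier itself may contain '#'."""
    query_id, _, invocation = ticket.rpartition("#")
    return query_id, int(invocation)


first_turn = encode_ticket("cuisine", 1)
second_turn = encode_ticket("cuisine", 2)
same_query_different_turn = first_turn != second_turn
```

With such an encoding, comparing two ticket strings for equality compares the identifier and the invocation counter in a single step.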

For example, suppose the user is using the system to order food. The user has both voice and
keyboard access. When asked what kind of cuisine is desired, the user types, 
"Chinese" (Input A). Because the user is anxious, she simultaneously says into the 
microphone "Chinese" (Input B) during a system delay. The system then prompts, 
"You want to order Chinese food?" The user changes her mind and says, "No." 
When the system returns to the menu, the voice input, "Chinese," arrives (delayed) 
but is denied because it was made during an earlier invocation. 

[0022] Although the present invention has been described with reference to a 
preferred embodiment thereof, it will be understood that the invention is not limited to 
the details thereof. Various modifications and substitutions have been suggested in 
the foregoing description, and others will occur to those of ordinary skill in the art. 
All such substitutions are intended to be embraced within the scope of the invention 
as defined in the appended claims. 



